The invention

The subject matter of the claimed invention is related to improving the user 
interface of digital devices having Chinese character keys for input. More particularly, 
the invention is related to a method for numerically encoding the Chinese characters by 
decomposing them into six unique code elements where the six code elements are 
mapped onto six numeric keys. 
The specification to which the oath or declaration is directed has not been
adequately identified. See MPEP § 602. 

It does not identify the foreign application for patent or inventor's certificate on 
which priority is claimed pursuant to 37 CFR 1.55, and any foreign application 
having a filing date before that of the application on which priority is claimed, by 
specifying the application number, country, day, month and year of its filing. 
Claims 2, 5, 8 and 10 are rejected under 35 U.S.C. 112, second paragraph, as
being indefinite for failing to particularly point out and distinctly claim the subject matter 
which applicant regards as the invention. 

Regarding claim 2, the phrase "such as" renders the claim indefinite because it is 
unclear whether the limitations following the phrase are part of the claimed invention. 
See MPEP § 2173.05(d). 

Regarding claims 5, 8 and 10, the phrase "for example" renders the claim 
indefinite because it is unclear whether the limitation(s) following the phrase are part of 
the claimed invention. See MPEP § 2173.05(d). 

Claims 1-11 are rejected as failing to define the invention in the manner required 
by 35 U.S.C. 112, second paragraph. 

The claim(s) are narrative in form and replete with indefinite and functional or 
operational language. The structure which goes to make up the device must be clearly 
and positively specified. The structure must be organized and correlated in such a 
Edgar H. Sibley
Panel Editor

The Six-Digit Coding Method (SDCM) is a new coding method for Chinese
characters. It is based on the structural analysis of Chinese characters. We
recently developed this method and have successfully used it to code 11,100
characters, including the simplified, traditional, and variant forms found in
Xin Hua Dictionary [7]. This article illustrates the basic principles, features,
and some viewpoints concerning the method.

Six-Digit Coding Method 

Jinan Qiao, Yizheng Qiao, and Sanzheng Qiao 



In Chinese, there are approximately 2,000-4,000 com- 
monly used characters plus a few thousand more tech- 
nical characters. It is impossible to have one key for 
each character. Thus, a coding system for Chinese 
characters is not only the basis for Chinese information 
interchange, but also an important tool in communica- 
tion, text processing, and many other fields. A good 
coding method can benefit the modernization of China 
directly, since it is essential for a computer input sys- 
tem. More than one hundred Chinese character coding 
systems have been proposed, and some of them have 
been adopted as tools for communication and text pro- 
cessing [2, 3, 8]. They have various shortcomings how- 
ever, and researchers are still looking for better coding 
methods. As computers begin to gain widespread use in 
China, there is an urgent need for a good coding sys- 
tem; yet there is no standard Chinese coding system at 
the present time. 

Chinese characters incorporate shape, sound, and 
complex hieroglyphic meanings into an ideographic 
language, which is different from alphabetic languages 
used in most countries around the world. Conse- 
quently, there are several major difficulties in devel- 
oping a good coding system. We think a good method 
should meet the following four requirements. 

First, versatility is important. There are many forms 
of Chinese characters because of all the changes and 
complications in the long history of the Chinese lan- 
guage. For example, there are the traditional, variant, 
and simplified forms, which are unavoidable in study- 
ing ancient Chinese history. A coding system tailored to 
represent only the simplified form is obviously not good 
enough. How to code all forms of characters is not only 
a major challenge, but also required by the computer- 
ized study of the rich history of China. 

Second, a standard style is essential for a standard 
coding method. There are many different shapes and 
complicated structures, since Chinese characters are 
evolved from ideographs. Common printing fonts in- 
clude the Old Song, Fang Tou, Zheng Kai, Fang Song, 
and Li Shu; handwriting styles include Li, Zhuan, Cao,
Hang, Kai, and Mei Shu. Different styles may lead to 
different codes. Therefore, it is necessary to choose a 
widely used style as the standard. 

Third, the One Code, One Character (OCOC) doc- 
trine, i.e., every code should have only one correspond- 
ing character and vice versa, is obviously desirable. At 
present, most coding systems cannot satisfy this. For 
example, the methods based on Pinyin are One Code, 
Multiple Characters (OCMC); i.e., one code has many 
corresponding characters sharing a common pronuncia- 
tion. The Pinyin System consists of 403 different sounds 
[7] that are pronunciations of Chinese characters. Each 
sound can be read in four different tones that may not 
be used as input into a computer system. On the aver- 
age, every sound corresponds to 17 characters with to- 
tally different meanings. For example, in the Xin Hua 
Dictionary, the Pinyin shi represents 78 characters that 
include: city, scholar, food, to lose, matter, wet, poem, 
lion, world, dead body, ten, stone, time, real, to be, to 
drive, to try. The Pinyin fu has 98 different characters, 
ji 119, and yi 131. The context provides the only means 
of distinguishing between them in spoken Chinese. On 
a computer, even if the users are familiar with Pinyin, 
they must look for the desired character among all the 
different characters. Furthermore, China has eight major
dialects with totally different pronunciations [4]. A
coding system based on pronunciation can raise severe 
problems in inputting Chinese into a computer system. 
Clearly, OCOC is a requisite in good Chinese character 
coding systems. A major disadvantage of current OCOC 
systems, such as the Standard Code of Chinese Charac- 
ters [6], is that the operator has to find the code in a 
chart that contains thousands of characters each time a 
character is entered. 

Finally, a good coding system should be simple 
enough so that users can quickly figure out the code 
when they see a character. This is termed "See Charac- 
ter, Know Code" (SCKC). In this way, the method can 
be grasped easily without vast knowledge of the 
Chinese language. 

Six-Digit Coding Method (SDCM) is designed to meet 
all of the above requirements. It is the first coding
method which is based on the shape of characters. This 
article is organized as follows. We first illustrate the 
principles of this method. The coding rules are briefly 
explained next, and the advantages of this method are 
summarized. Finally, we discuss various viewpoints. 

THE PRINCIPLES 

In SDCM, a Chinese character is divided into six sections.
Each section is represented by a decimal digit. All
six digits then make up a code that denotes the charac- 
ter. The standard form of characters for this method is 
in accordance with a standard, the Font Table of Com- 
monly Used Characters in Printing, jointly published by 
the Ministry of Culture and the Language Reform Com- 
mittee of the People's Republic of China, in Beijing, 
1964. For convenience, we have adopted the most 
widely used Old Song style. A Chinese character can 
consist of single, double, and triple structures or combinations
of these. For example, the character [glyph] is considered
to have a single structure, the character [glyph] a
double structure, and [glyph] a triple structure. On this basis,
we classify each character into one of the following
four types of structures: 

1. Single and Double Structures 

A character of this type is divided into six sections as 
shown in Figure 1. 

Section 1 is the upper left corner; 
Section 2 is the lower left corner; 
Section 3 is the upper right corner; 
Section 4 is the lower right corner; 
Section 5 is the combination of Section 1 and Sec- 
tion 3; 

Section 6 is the combination of Section 2 and Sec- 
tion 4. 



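[Figure 1 has two panels. The division: Sections 1 and 3 are the upper
left and upper right corners, Sections 2 and 4 are the lower left and
lower right corners, Section 5 combines Sections 1 and 3, and Section 6
combines Sections 2 and 4. The code: the first digit corresponds to
section 1 and the last digit corresponds to section 6.]

Figure 1. The Division of a Chinese Character
and Its Code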



We have categorized about 100 commonly used 
strokes into nine comprehensive groups, and assigned 
digits 1-9 to these groups. The coding table of strokes is 
not included here. 

Each section is coded by a decimal digit that corresponds
to the stroke of the character that occupies the
section and belongs to the group to which the digit is 
assigned. Unoccupied sections or sections with un- 
grouped strokes are represented by the digit 0. Section 
1 corresponds to the left-most digit; section 2 corre- 
sponds to the second left-most digit; and so on. 
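
The digit placement described above can be sketched in a few lines of
Python. This is only an illustration, not the authors' software: the
function name assemble_code and the dictionary input are assumptions made
for the example, and the digits used (2 for section 2, 7 for section 4,
3 for section 5) are the ones the text gives for Example 1.

```python
def assemble_code(section_digits):
    """Build a six-digit code from a {section_number: digit} mapping.

    Sections missing from the mapping are treated as unoccupied and
    receive the digit 0, as the text prescribes.
    """
    for section, digit in section_digits.items():
        if section not in range(1, 7):
            raise ValueError(f"section must be 1-6, got {section}")
        if digit not in range(0, 10):
            raise ValueError(f"digit must be 0-9, got {digit}")
    # Section 1 supplies the left-most digit, section 6 the right-most.
    return "".join(str(section_digits.get(s, 0)) for s in range(1, 7))


# Digits stated in the text for Example 1 (hypothetical input format).
print(assemble_code({2: 2, 4: 7, 5: 3}))  # 020730
```

Looking up which group a stroke belongs to is left out, since the coding
table of strokes is not reproduced in the article.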

Example 1 (single structure). The stroke assigned to
group 2 sticks out to the left in section 2, so the second
digit of the code is 2. Likewise, the fourth and fifth
digits are 7 and 3 respectively. We usually do not code
a stroke twice, so the code for this character is 020730,
as shown below. The next section gives more details of
the coding rules.

section 1    0
section 2    2
section 3    0
section 4    7
section 5    3
section 6    0



Example 2 (left-right double structures). Similar to Ex- 
ample 1, each of the sections 1 to 4 is coded by a digit 
that corresponds to the stroke occupying this section. 




section 1    4
section 2    2
section 3    6
section 4    5
section 5    0
section 6    0



Example 3 (upper-lower double structure). Here, sec- 
tion 1 is unoccupied. Section 2 has a stroke that belongs 
to group 9. Similarly, sections 3 and 5 are coded by 1 
and 3 respectively. 



section 1    0
section 2    9
section 3    1
section 4    0
section 5    3
section 6    0

2. Triple Structure 

In this type, a character is divided into six sections as 
shown in Figure 2. Notice that sections 5 and 6 are 
at the upper half and lower half of the middle structure 
respectively. 



Section 1    Section 5    Section 3
Section 2    Section 6    Section 4

Figure 2. The Division Pattern of Type 2
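
As a rough sketch, the type 2 division can be written down as a small
grid. The layout below is an assumption read off Figure 2 and the
statement that sections 5 and 6 are the upper and lower halves of the
middle structure; the names TYPE2_GRID and position_of are hypothetical.

```python
# Assumed layout for a triple (left-middle-right) structure.
TYPE2_GRID = [
    [1, 5, 3],  # upper row: left, middle, right
    [2, 6, 4],  # lower row: left, middle, right
]

ROW_NAMES = ("upper", "lower")
COLUMN_NAMES = ("left", "middle", "right")


def position_of(section, grid=TYPE2_GRID):
    """Return (row_name, column_name) for a section number in the grid."""
    for row_index, row in enumerate(grid):
        if section in row:
            return ROW_NAMES[row_index], COLUMN_NAMES[row.index(section)]
    raise ValueError(f"section {section} is not in the grid")


print(position_of(5))  # ('upper', 'middle')
print(position_of(4))  # ('lower', 'right')
```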



Example 4 (triple structure). The coding is shown as 
follows. Although two of the strokes are not coded, the 
OCOC requirement can still be satisfied. 




section 1    9
section 2    0
section 3    2
section 4    3
section 5    3
section 6    1



3. Double Structured Upper Half Plus Triple Struc- 
tured Lower Half 

In this type, a character is divided into six sections as 
shown in Figure 3. 




Example 5. This example shows a character with 
type 3 structure. Again, some strokes are not coded, but 
six digits are sufficient to identify this character. 




section 1    5
section 2    1
section 3    4
section 4    7
section 5    0
section 6    7



4. Triple Structured Upper Half Plus Single or 
Double Structured Lower Half 

The division pattern is shown in Figure 4. 




Example 6. Here we give an example of type 4 struc- 
ture. 




section 1    7
section 2    1
section 3    2
section 4    1
section 5    4
section 6    2



THE CODING RULES OF SDCM 

All basic building blocks of Chinese characters and 
their corresponding codes are listed in the coding table 
of strokes. A character is first classified into one of the 
four types; then it is coded digit by digit from section 1 
to section 6. If the start or the end of a stroke sticks out 
in a section, then this stroke is chosen to code the 
section. A stroke cannot be coded more than once, un- 
less it crosses other strokes. If it crosses another stroke 
once, it may be coded twice; and if it crosses twice, it 
may be coded three times. In writing Chinese, charac- 
ters are generally written from top to bottom, left to 
right. The start of a stroke is where you begin the 
stroke, and the end of a stroke is where you finish the 
stroke. 
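
The crossing rule can be expressed as a simple check on a proposed
coding. The sketch below is an illustration under stated assumptions:
the text gives the limits for zero, one, and two crossings only, so
extending the pattern to "crossings + 1" is an assumption, and the
function names and the stroke labels in the usage lines are hypothetical.

```python
from collections import Counter


def max_times_coded(crossings):
    """Upper limit on how often one stroke may be coded.

    The text states: no crossing -> once, one crossing -> twice,
    two crossings -> three times. Continuing the pattern beyond two
    crossings is an assumption.
    """
    return crossings + 1


def overcoded_strokes(section_strokes, crossings):
    """Return strokes used for more sections than the rule allows.

    section_strokes: {section_number: stroke_label or None}
    crossings: {stroke_label: number of other strokes it crosses}
    """
    used = Counter(s for s in section_strokes.values() if s is not None)
    return sorted(
        label
        for label, count in used.items()
        if count > max_times_coded(crossings.get(label, 0))
    )


# Hypothetical character: stroke "a" is chosen for sections 3 and 5.
choice = {1: "b", 2: "c", 3: "a", 4: None, 5: "a", 6: None}
print(overcoded_strokes(choice, {"a": 1}))  # []
print(overcoded_strokes(choice, {"a": 0}))  # ['a']
```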

SDCM utilizes this feature in its coding. For example,
in the character [glyph], there are two strokes starting in
section 1, namely [stroke] and [stroke]. The former is selected to
code this section because it is higher than the latter. In
section 2, [stroke] is chosen because its end is more exposed
at the lower left corner than [stroke] is. Similarly, [stroke] is
picked for section 3, and [stroke] is selected for section 4. In
section 5, [stroke] is selected again because it is the most
exposed and it crosses [stroke]. For section 6, [stroke] is coded.
Thus, according to the coding table, the code for this character is
377831. Another example for the crossing rule is the
character [glyph], whose code is 111242.

THE ADVANTAGES OF SDCM 

First, SDCM adopts the Old Song style which is unified, 
widely established, and used in printing. To avoid er- 
rors in coding, we have standardized SDCM by revising 
a few of the irregularly formed characters that resulted 
from printing errors. 
Second, 11,100 characters have been coded with 
SDCM, and no two characters share the same code.
This fulfills the OCOC requirement. 
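
A check of this kind is easy to automate. The following is a minimal
sketch, assuming the coded characters are available as (character, code)
pairs; the function name ococ_violations and the placeholder labels
char_A and char_B are hypothetical, while the two codes are taken from
the coding rules section.

```python
from collections import defaultdict


def ococ_violations(pairs):
    """Check One Code, One Character on (character, code) pairs.

    Returns two dicts: codes shared by several characters, and
    characters that were given several codes. Both are empty when the
    table is one-to-one.
    """
    chars_by_code = defaultdict(set)
    codes_by_char = defaultdict(set)
    for char, code in pairs:
        chars_by_code[code].add(char)
        codes_by_char[char].add(code)
    shared_codes = {
        code: sorted(chars)
        for code, chars in chars_by_code.items()
        if len(chars) > 1
    }
    multi_coded = {
        char: sorted(codes)
        for char, codes in codes_by_char.items()
        if len(codes) > 1
    }
    return shared_codes, multi_coded


# The two codes appear in the text; the labels are placeholders.
table = [("char_A", "377831"), ("char_B", "111242")]
print(ococ_violations(table))  # ({}, {})
```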

Third, SDCM is the first method of coding based on 
the shape of characters. It is efficient and convenient. 
To code a character, a stroke is selected for each sec- 
tion and then its shape is compared with the strokes in 
the coding table to determine the digit for this section. 
In other words, the code only depends on the positions 
and shapes of the strokes in the characters. It is inde- 
pendent of the order of the strokes in the character. 
Consequently it makes SCKC possible and requires lit- 
tle knowledge of the Chinese language. Furthermore, 
on the basis of psychological analysis done in China, 
people recognize a character by first looking at the ex- 
posed parts of the sections of the characters at the sides 
and corners. If they still cannot tell what the character 
is, they then proceed to look at the subtle differences in 
the middle [5, 9]. Therefore, the four corners of the
character sections, as well as the top and the bottom 
can be effectively used to differentiate between various 
characters. After coding thousands of characters, we 
have found that six sections suffice to identify a charac- 
ter even when some strokes are not coded, as shown in 
Examples 4, 5, and 6 in the previous section. 

Fourth, by using six decimal digits to code Chinese 
characters, we can potentially code one million charac- 
ters and still achieve the OCOC requirement. It is esti- 
mated that there are 50,000-60,000 characters in total. 
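
The arithmetic behind this claim is short enough to spell out. The
figures below come from the text; the variable names are only for
illustration. Note that one million is an upper bound on the number of
distinct codes, and whether real characters spread over that space
without collisions is an empirical question.

```python
CODE_LENGTH = 6           # six sections, one digit each
DIGITS_PER_POSITION = 10  # digits 1-9 for stroke groups, 0 for none

capacity = DIGITS_PER_POSITION ** CODE_LENGTH
estimated_total = 60_000  # upper end of the 50,000-60,000 estimate
coded_so_far = 11_100     # characters coded by the authors

print(capacity)                              # 1000000
print(f"{estimated_total / capacity:.1%}")   # 6.0%
print(f"{coded_so_far / capacity:.2%}")      # 1.11%
```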

Finally, with SDCM, one can code characters quickly. 
We expect that anyone with limited knowledge of 
Chinese would be able to master all the coding rules 
and code characters, without checking the coding table, 
after only two or three days of training. The code can 
then be quickly entered on a numerical keyboard. 

SOME VIEWPOINTS 

Some critics say that a six-digit number is too long for 
one character and that it would slow down the input 
speed. For instance, they argue that the Pinyin System 
requires an average of three keystrokes to input the 
Pinyin of a character, but SDCM always needs six keystrokes
per character, so the keystroke ratio of Pinyin to
SDCM is 1:2. The difference here is that the six keystrokes
in SDCM call up the exact desired character
because SDCM satisfies the OCOC requirement. In con- 
trast, after the three keystrokes in Pinyin, you have 
identified only the pronunciation of the character, and 
you still have to choose the exact character of interest 
among all the characters that share the same pronunci- 
ation. The actual number of characters to choose from 
can range from three or four to over one hundred, de- 
pending on the specific pronunciation. The average, as 
we noted in the introduction, is seventeen. Since we 
plan to code 50,000 to 60,000 characters with the 
SDCM, six decimal digits are necessary. Moreover, 
SDCM is designed to reduce the amount of memoriza- 
tion. The additional typing can be compensated by the 
efficiency and the convenience of coding with the 
method. 

The SDCM provides a useful tool for Chinese charac- 
ter processing on computers. We believe that the SDCM 
has a great impact on Chinese information processing
because it adopts the standard style, achieves One
Code, One Character, and can be used to code all forms
of characters efficiently. We have already coded over
eleven thousand characters, and we plan to develop
SDCM software and use neural networks to code
characters.